probabilities. After that, whenever a patient is newly diagnosed with cancer, you can take that person’s
age, stage, and grade, and generate an expected survival curve tailored for that particular patient. (The
patient may not want to see it, but at least it could be done.)
You’ll probably have to do these calculations outside of the software that you use for the
survival regression, but the calculations aren’t difficult and can be done in a Microsoft Excel
spreadsheet. The example in the following sections uses the small set of sample data that’s
preloaded into the online calculator for PH regression at
https://statpages.info/prophaz.html. This particular example has only one predictor, but
the basic idea extends to multiple predictors.
Obtaining the necessary output
Figure 23-6 shows the output from the built-in example (omitting the Iteration History and Overall
Model Fit sections). Pretend that this model represents survival, in years, as a function of age for
patients just diagnosed with some particular disease. In the output, the age variable is called Variable
1.
FIGURE 23-6: Output of PH regression for generating prognostic curves.
Looking at Figure 23-6, first consider the table in the Baseline Survivor Function section, which has
two columns: time in years, and predicted survival expressed as a fraction. It also has four rows,
one for each time point at which one or more deaths were actually observed. The baseline survival
curve for the example data starts at 1.0 (100 percent survival) at time 0, as survival curves always do,
but this row isn’t shown in the output. The survival curve remains flat at 100 percent until year two,
when it suddenly drops down to 99.79 percent, where it stays until year seven, when it drops down to
98.20 percent, and so on.
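If it helps to see that stair-step behavior spelled out, here is a minimal Python sketch of a baseline survival lookup. It uses only the two rows quoted above (year 2 and year 7); the other two rows of the table aren't reproduced here, so the sketch can't be trusted past year seven. The names baseline_times, baseline_survival, and baseline_survival_at are made up for illustration and aren't part of the online calculator.

```python
from bisect import bisect_right

# Only the two rows quoted in the text; the table in the output has four rows,
# and the remaining two are deliberately left out rather than guessed.
baseline_times = [2, 7]               # years at which the curve drops
baseline_survival = [0.9979, 0.9820]  # survival fraction from that year onward


def baseline_survival_at(t):
    """Return the baseline survival fraction at time t (in years).

    The curve is a step function: it starts at 1.0 at time 0 and stays flat
    until the next listed time point, where it drops to the listed value.
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    # Count how many listed time points are at or before t.
    rows_passed = bisect_right(baseline_times, t)
    if rows_passed == 0:
        return 1.0  # the implicit row at time 0 that the output doesn't show
    return baseline_survival[rows_passed - 1]


if __name__ == "__main__":
    for year in (0, 1, 2, 5, 7):
        print(year, baseline_survival_at(year))
```

Asking for year 5, for example, returns 0.9979, because the curve stays at that level from year two until the next drop at year seven.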
In the Descriptive Stats section near the start of the output in Figure 23-6, the average age of the 11